alexsusanu@docs:Core Application Lifecycle Management in Kubernetes: A Deep Dive $
alexsusanu@docs
:~$ cat Core Application Lifecycle Management in Kubernetes: A Deep Dive.md

HomeNOTES → Core Application Lifecycle Management in Kubernetes: A Deep Dive

Core Application Lifecycle Management in Kubernetes: A Deep Dive

The Fundamental Challenge: Why Kubernetes Exists

Before diving into deployment strategies, it's crucial to understand why Kubernetes was created. Traditional application deployment was like managing a single server where you'd manually install software, configure it, and pray nothing breaks. When something did break, you'd scramble to fix it while your application was down.

Kubernetes solves the fundamental problem of distributed application management - running applications across multiple machines reliably, with automatic recovery, scaling, and updates. It treats your infrastructure as a pool of resources rather than individual servers.

Understanding the Building Blocks

Pods: The Atomic Unit

A Pod is the smallest deployable unit in Kubernetes. Think of it as a "wrapper" around one or more containers that share storage and network.

Why Pods exist instead of just containers?

apiVersion: v1
kind: Pod
metadata:
  name: web-pod
spec:
  containers:
  - name: web-server
    image: nginx:1.21
    ports:
    - containerPort: 80
  - name: log-collector
    image: fluent/fluent-bit
    volumeMounts:
    - name: shared-logs
      mountPath: /var/log
  volumes:
  - name: shared-logs
    emptyDir: {}

This Pod runs two containers that share the same network (localhost) and volume. The log collector can access nginx logs directly. You couldn't achieve this tight coupling with just containers.

ReplicaSets: Ensuring Availability

ReplicaSets ensure a specified number of Pod replicas are running at any time.

Why not just run Pods directly?

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21

If you run Pods directly, when a node fails, your Pod dies permanently. ReplicaSets monitor and replace failed Pods automatically. It's like having a supervisor that ensures you always have exactly 3 copies of your application running.

Deployments: The Management Layer

Deployments manage ReplicaSets and provide declarative updates.

Why use Deployments instead of ReplicaSets directly?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1

Deployments provide versioning and rollback capabilities. When you update your application, Deployment creates a new ReplicaSet while gradually scaling down the old one. This gives you controlled, reversible updates.

Deployment Strategies: The How and Why

Rolling Updates: The Default Strategy

What it is: Gradually replace old Pods with new ones, maintaining service availability.

Why use it: Zero-downtime updates for most applications. It's the safest general-purpose strategy.

Deep Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  replicas: 6
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapp:v1.0
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2    # Never have fewer than 4 pods
      maxSurge: 2          # Never have more than 8 pods

The Process:

  1. Current state: 6 pods running v1.0
  2. Create 2 new pods with v1.1 (now 8 total)
  3. Wait for readiness probes to pass
  4. Terminate 2 old pods (back to 6 total)
  5. Repeat until all pods are v1.1

Why these settings matter:

  • maxUnavailable: 2: Ensures you never drop below 4/6 capacity (66%)
  • maxSurge: 2: Limits resource usage during updates
  • readinessProbe: Prevents routing traffic to unhealthy pods

When to use: Most web applications, APIs, microservices where brief mixed versions are acceptable.

Blue-Green Deployments: The Safe Bet

What it is: Maintain two identical production environments, switching traffic between them.

Why use it: Complete isolation between versions, instant rollback capability, perfect for high-stakes updates.

Deep Example:

# Blue environment (current)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
  labels:
    version: blue
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:v1.0

---
# Green environment (new)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
  labels:
    version: green
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:v1.1

---
# Service switches between blue and green
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    app: myapp
    version: blue  # Change this to 'green' to switch
  ports:
  - port: 80
    targetPort: 8080

The Process:

  1. Deploy green environment alongside blue
  2. Test green thoroughly in isolation
  3. Switch service selector from blue to green
  4. Monitor for issues
  5. If problems arise, switch back to blue instantly
  6. Decommission blue after confidence in green

Why this approach:

  • Resource cost: Doubles infrastructure during deployment
  • Database challenges: Both versions might need the same database
  • Perfect for: Critical applications, major version changes, regulated environments

Real-world example: A banking application updating its payment processing system would use blue-green to ensure zero transaction loss and instant rollback capability.

Canary Deployments: The Gradual Approach

What it is: Deploy new version to a small subset of users, gradually increasing traffic.

Why use it: Minimize blast radius of issues, gather real-world performance data, perfect for incremental confidence building.

Deep Example with Istio:

# Original deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-v1
spec:
  replicas: 9
  selector:
    matchLabels:
      app: myapp
      version: v1
  template:
    metadata:
      labels:
        app: myapp
        version: v1
    spec:
      containers:
      - name: app
        image: myapp:v1.0

---
# Canary deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-v2
spec:
  replicas: 1  # Start small
  selector:
    matchLabels:
      app: myapp
      version: v2
  template:
    metadata:
      labels:
        app: myapp
        version: v2
    spec:
      containers:
      - name: app
        image: myapp:v2.0

---
# Traffic splitting with Istio
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: app-vs
spec:
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: app-service
        subset: v2
  - route:
    - destination:
        host: app-service
        subset: v1
      weight: 90
    - destination:
        host: app-service
        subset: v2
      weight: 10  # 10% of traffic to canary

The Progressive Process:

  1. Week 1: Deploy v2 with 1 replica (10% traffic)
  2. Week 2: Monitor metrics, scale to 3 replicas (25% traffic)
  3. Week 3: If healthy, scale to 5 replicas (50% traffic)
  4. Week 4: Full deployment (100% traffic)

Why this timeline:

  • Week 1: Catch obvious bugs with minimal impact
  • Week 2: Observe resource usage patterns
  • Week 3: Validate performance under load
  • Week 4: Complete confidence in stability

Advanced Canary with Flagger:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: app-canary
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  service:
    port: 80
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
    - name: request-duration
      thresholdRange:
        max: 500
  webhooks:
  - name: load-test
    url: http://flagger-loadtester.test/

This automatically promotes the canary based on success rate and response time metrics, rolling back if thresholds aren't met.

Scaling Applications: The Dynamic Response

Horizontal Pod Autoscaler (HPA)

Why auto-scaling matters: Manual scaling is reactive and error-prone. By the time you notice high CPU, users are already experiencing slow response times.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 3
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

Why these settings:

  • Multiple metrics: Prevents scaling based on a single misleading metric
  • Stabilization windows: Prevents thrashing during traffic spikes
  • Asymmetric scaling: Scale up quickly (100% in 60s), scale down slowly (10% per minute)
  • Custom metrics: Requests per second is often more meaningful than CPU for web apps

Vertical Pod Autoscaler (VPA)

Why VPA exists: Right-sizing containers prevents resource waste and ensures performance.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: app
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 4Gi
      controlledResources: ["cpu", "memory"]

Why these limits:

  • minAllowed: Prevents under-provisioning that causes crashes
  • maxAllowed: Prevents runaway resource consumption
  • updateMode: Auto: Automatically restarts pods with new resource requests

Rollouts and Rollbacks: The Safety Net

Deployment History and Rollbacks

Why rollback capability is critical: Even with thorough testing, production environments reveal issues testing cannot predict.

# View rollout history
kubectl rollout history deployment/app-deployment

# Detailed history of a specific revision
kubectl rollout history deployment/app-deployment --revision=3

# Rollback to previous version
kubectl rollout undo deployment/app-deployment

# Rollback to specific revision
kubectl rollout undo deployment/app-deployment --to-revision=2

# Monitor rollback progress
kubectl rollout status deployment/app-deployment

Advanced Rollout Configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
  annotations:
    deployment.kubernetes.io/revision: "3"
spec:
  revisionHistoryLimit: 10  # Keep 10 previous versions
  progressDeadlineSeconds: 600  # Fail deployment after 10 minutes
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    spec:
      containers:
      - name: app
        image: myapp:v1.3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3

Why these settings matter:

  • revisionHistoryLimit: Balances storage usage with rollback options
  • progressDeadlineSeconds: Prevents infinite hanging deployments
  • Liveness vs Readiness probes:
    • Liveness: Restarts unhealthy containers
    • Readiness: Stops routing traffic to containers that aren't ready

Real-World Integration Example

Here's how these concepts work together in a production e-commerce application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ecommerce-api
  labels:
    app: ecommerce-api
    tier: backend
spec:
  replicas: 5
  revisionHistoryLimit: 5
  selector:
    matchLabels:
      app: ecommerce-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 2
  template:
    metadata:
      labels:
        app: ecommerce-api
        version: v2.1
    spec:
      containers:
      - name: api
        image: ecommerce/api:v2.1
        ports:
        - containerPort: 8080
        env:
        - name: DB_CONNECTION_POOL_SIZE
          value: "20"
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        livenessProbe:
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 10
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /api/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 5
          failureThreshold: 3

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ecommerce-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-api
  minReplicas: 5
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 120
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

Why this configuration for e-commerce:

  • 5 minimum replicas: Ensures availability during traffic spikes
  • Rolling update with maxSurge: 2: Faster deployments during business hours
  • Conservative scaling: Prevents over-provisioning during flash sales
  • Longer stabilization: Prevents rapid scaling during checkout surges

Key Takeaways

  1. Layered Architecture: Pods → ReplicaSets → Deployments each solve specific problems
  2. Strategy Selection: Choose based on risk tolerance and infrastructure constraints
  3. Proactive Monitoring: Use multiple metrics for scaling decisions
  4. Safety First: Always configure rollback mechanisms before deploying
  5. Real-world Testing: Canary deployments provide the best balance of safety and speed

The beauty of Kubernetes lifecycle management is that it transforms application deployment from a manual, error-prone process into a declarative, repeatable, and self-healing system. Each component exists to solve specific real-world problems that emerge when running applications at scale.

Last updated: 2025-08-26 20:00 UTC